This tutorial is heavily based on the Appendices and Supporting Information found in the Open Access {aniMotum} paper:

  • Jonsen ID, Grecian WJ, Phillips L, Carrol G, McMahon C, Harcourt RG, Hindell MA & Patterson TA (2023) aniMotum, an R package for animal movement data: Rapid quality control, behavioural estimation and simulation. Methods in Ecology and Evolution 14(3): 806-816 https://doi.org/10.1111/2041-210X.14060

Please refer to the paper, appendices and the R package documentation for a comprehensive overview of {aniMotum}.


0.1 Before we begin

Please download the {aniMotum} package from R-Universe.

# install from my R-universe repository
install.packages("aniMotum", 
                 repos = c("https://cloud.r-project.org",
                 "https://ianjonsen.r-universe.dev"),
                 dependencies = TRUE)

This installs all Imported and Suggested R packages from CRAN and R-universe. If queried, answer Yes to install the source version. Note that if you haven’t installed {aniMotum} previously, installation of dependent packages may take a while, especially if many of them need to be compiled. You should only need to go through this once; subsequent installations of {aniMotum} updates will be much faster.

Alternatively you can download a binary version of aniMotum here: https://ianjonsen.r-universe.dev/aniMotum

For full instructions see the aniMotum homepage on GitHub: https://github.com/ianjonsen/aniMotum


1 Introduction

The aim of this practical is to give you an understanding of how we can use the {aniMotum} R package to process and analyse animal movement data. My aim is that you can go on to apply similar workflows to your own animal movement data.

During this practical we will:

  1. Load in an example data set
  2. Fit a state-space model to regularise and error correct a movement path
  3. Visualise the model fit
  4. Fit a move persistence model to estimate behaviour
  5. Extract the regularised data
  6. Manual mapping
  7. Simulate from the model
  8. Re-route movement paths

This practical will be based on the {tidyverse} style of R coding. You will become familiar with using the native pipe |> to perform data manipulations, using {ggplot2} to generate publication-quality figures, and using the {sf} package to handle spatial data.

For more information on the {tidyverse} check out the Wickham & Grolemund book ‘R for Data Science’. You can access it for free online here:
https://r4ds.had.co.nz

The project website can be accessed here:
https://www.tidyverse.org

For more information on the {sf} project check out https://r-spatial.github.io/sf/

As discussed in the presentation we can pass data from several different tag types to {aniMotum}. For an outline of how {aniMotum} expects this data to be formatted see: https://ianjonsen.github.io/aniMotum/articles/Overview.html


2 Practical

In March 2019 I was part of an expedition to deploy Satellite Relay Data Loggers on harp seal pups in the Gulf of St Lawrence, Canada. These devices are satellite-linked computers capable of recording a wide range of information and transmitting it via the Argos network. Here we will focus on the location data recorded by the Argos satellites when communicating with three of the tags.

Harp seals give birth on the sea ice when it is at its most southerly extent in late winter/early spring. Females wean their pups after around 2 weeks. We deployed tags on pups that were approximately 3 weeks old, and tracked their movements as they left the sea ice and began their first migration north.

This work was published in Royal Society Open Science and is available Open Access here:

  • Grecian WJ, Stenson GB, Biuw M, Boehme L, Folkow LP, Goulet PJ, Jonsen ID, Malde A, Nordøy ES, Rosing-Asvid A & Smout S (2022) Environmental drivers of population-level variation in the migratory and diving ontogeny of an Arctic top predator Royal Society Open Science 9: 211042 https://doi.org/10.1098/rsos.211042

2.1 Loading data

Data for 3 animals are available in the repo. Let’s load them into R and have a quick look.

# load libraries
library(aniMotum)
library(patchwork)
library(sf)
library(tidyverse)

# load harp seal locations
locs <- read_csv("harps.csv")
locs
## # A tibble: 10,884 × 8
##    id         date                lc      lon   lat  smaj  smin   eor
##    <chr>      <dttm>              <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
##  1 hp6-747-19 2019-03-23 00:08:33 3     -59.6  48.1   269    93    97
##  2 hp6-747-19 2019-03-23 00:25:33 3     -59.6  48.1   338    75    69
##  3 hp6-747-19 2019-03-23 00:37:38 1     -59.6  48.1  4984   144    83
##  4 hp6-747-19 2019-03-23 10:20:14 3     -59.7  48.2   358    89   116
##  5 hp6-747-19 2019-03-23 10:22:18 3     -59.7  48.2   239    82    96
##  6 hp6-747-19 2019-03-23 15:07:47 2     -59.7  48.2  1345    76   132
##  7 hp6-747-19 2019-03-23 15:36:07 3     -59.7  48.2   665    48    90
##  8 hp6-747-19 2019-03-23 20:03:42 B     -59.7  48.2  4369   528    42
##  9 hp6-747-19 2019-03-23 20:13:27 2     -59.7  48.3  1890    72    84
## 10 hp6-747-19 2019-03-24 01:25:02 3     -59.7  48.3   271    70    82
## # ℹ 10,874 more rows

The harp data are formatted in a pretty standard way, including Argos’ Least-Squares (LS) location error classes lc. This is the minimum required by fit_ssm:
- id a unique identifier for each animal (or track) in the data.frame or tibble.
- date a date-time variable in the form YYYY-mm-dd HH:MM:SS (or YYYY/mm/dd HH:MM:SS). These can be text strings, in which case aniMotum converts them to POSIX format and assumes the timezone is UTC.
- lc the location class variable common to Argos data, with classes in the set: 3, 2, 1, 0, A, B, Z.
- lon the longitude variable.
- lat the latitude variable.

The lc values determine the measurement error variances (based on independent data; see Jonsen et al. 2020) used in the SSMs for each observation.
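Putting this together, a minimal input in the expected format might look like the following sketch (the id, dates and locations are hypothetical values, not from the harp seal data):

```r
# A minimal sketch of the input format fit_ssm expects (hypothetical values)
locs_min <- data.frame(
  id   = "hp-example-01",                                  # unique animal/track id
  date = c("2019-03-23 00:08:33", "2019-03-23 06:12:00"),  # YYYY-mm-dd HH:MM:SS
  lc   = c("3", "B"),                                      # Argos location classes
  lon  = c(-59.6, -59.5),
  lat  = c(48.1, 48.2)
)

# Character dates are allowed; aniMotum converts them to POSIXct assuming UTC.
# Doing the conversion yourself first lets you catch malformed date-times (NAs):
locs_min$date <- as.POSIXct(locs_min$date, tz = "UTC")
```

Any malformed date-time strings would show up here as NA values before you ever call fit_ssm.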

Since 2011, Argos location data have by default been processed with CLS Argos’ Kalman Filter (KF) algorithm. These data include error ellipse information for each observed location in the form of 3 variables: ellipse semi-major axis length, ellipse semi-minor axis length, and ellipse orientation.

The column names follow those for Argos LS data, with the following additions:
- smaj the Argos error ellipse semi-major axis length (m).
- smin the Argos error ellipse semi-minor axis length (m).
- eor the Argos error ellipse orientation (degrees from N).

Here, the error ellipse parameters for each observation define the measurement error variances used in the SSMs (Jonsen et al. 2020). Missing error ellipse values are allowed; those observations are treated as Argos LS data.
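A KF-style input is the LS format above plus the three ellipse columns. A sketch with hypothetical values, where the second row mimics an observation with no ellipse information (and so would be treated as LS data):

```r
# Sketch of a KF-style input with hypothetical values
locs_kf <- data.frame(
  id   = "hp-example-01",
  date = as.POSIXct(c("2019-03-23 00:08:33", "2019-03-23 06:12:00"), tz = "UTC"),
  lc   = c("3", "B"),
  lon  = c(-59.6, -59.5),
  lat  = c(48.1, 48.2),
  smaj = c(269, NA),   # semi-major axis (m); NA = no ellipse for this fix
  smin = c(93, NA),    # semi-minor axis (m)
  eor  = c(97, NA)     # ellipse orientation (degrees from N)
)
```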

# preliminary plot
p1 <- ggplot() +
  theme_bw() +
  geom_point(aes(x = lon, y = lat), data = locs) +
  facet_wrap(~id)
print(p1)

The plot reveals that one track has large gaps, while the others each contain at least one obviously erroneous location estimate.

Let’s check how frequently the tags were transmitting. This can guide the frequency of regularised locations estimated by {aniMotum}.

locs |>
  group_by(id) |>
  summarise(mean_interval_hrs = as.numeric(mean(difftime(date, lag(date)), na.rm = TRUE), "hours"),
            max_interval_hrs = as.numeric(max(difftime(date, lag(date)), na.rm = TRUE), "hours"))
## # A tibble: 3 × 3
##   id         mean_interval_hrs max_interval_hrs
##   <chr>                  <dbl>            <dbl>
## 1 hp6-747-19             2.97             303. 
## 2 hp6-749-19             0.936             15.0
## 3 hp6-756-19             1.17              54.3

On average two of the tags were transmitting hourly, while the first tag transmitted only every 3 hours. This difference is probably inflated by the large gaps in transmission: 303 hours is over 12.5 days!

2.2 Fit a state-space model

We can use {aniMotum} to regularise and error-correct these movement paths using a state-space model. For speed and simplicity we’ll ask for daily locations, but you can easily adjust this depending on your questions. For example, in the paper I dropped the data with large gaps and then used a 6-hour interval to match the locations to the 6-hour dive summary data transmitted by the tags.

We can fit to all animals in the same call using the fit_ssm function. Here I’m fitting a correlated random walk with model = "crw"; other options are a random walk (model = "rw") and move persistence (model = "mp").

fit <- fit_ssm(locs,
        vmax = 5,                            # speed filter threshold (m/s)
        model = "crw",
        time.step = 24,                      # prediction interval (hours)
        control = ssm_control(verbose = 0))  # suppress progress output

You can access the results and summary information about the model fit with the summary function:

summary(fit)
##   Animal id Model Time n.obs n.filt n.fit n.pred n.rr converged    AICc
##  hp6-747-19   crw   24  1971    583  1388    245    .      TRUE 22638.1
##  hp6-749-19   crw   24  5073   1339  3734    199    .      TRUE 49318.5
##  hp6-756-19   crw   24  3840   1027  2813    189    .      TRUE 42544.1
## 
## --------------
## hp6-747-19 
## --------------
##  Parameter Estimate Std.Err
##        D_x   1.8448   0.176
##        D_y   4.5803  0.5023
##      rho_p  -0.1793  0.0848
##        psi   8.9861  0.3696
## 
## --------------
## hp6-749-19 
## --------------
##  Parameter Estimate Std.Err
##        D_x   3.3592  0.1955
##        D_y   5.3051  0.3263
##      rho_p   0.0706  0.0491
##        psi   7.6369  0.1835
## 
## --------------
## hp6-756-19 
## --------------
##  Parameter Estimate Std.Err
##        D_x   5.5716  0.3673
##        D_y  23.8166  1.5672
##      rho_p  -0.1603    0.07
##        psi  14.8114   0.438

2.3 Visualise the model fit

We can quickly check the model fit via a 1-D time series plot of the model:

plot(fit,
     what = "predicted",
     type = 1,
     pages = 1)

The predicted locations (orange) are overlaid on the observations (blue). Note how the uncertainty increases when interpolating across the gaps we highlighted earlier. You would need to consider carefully whether to include these interpolated sections in any further analysis.

A 2-D plot of the model:

plot(fit,
     what = "predicted",
     type = 2,
     pages = 1,
     ncol = 3)

Again the 2D plot highlights the uncertainty in hp6-747-19’s path.

We can map the predicted locations straight from the fit_ssm object as follows:

aniMotum::map(fit,
              what = "predicted")
## using map scale: 10

Often when validating models we want to assess residual plots. {aniMotum} offers the option of calculating one-step-ahead prediction residuals. These can be helpful but are computationally demanding. We calculate the residuals via the osar function and can visualise them as a time-series, QQ plot and autocorrelation plot.

res <- osar(fit)
plot(res, type = "ts", pages = 1) | plot(res, type = "qq", pages = 1) | plot(res, type = "acf", pages = 1)

The time series and autocorrelation plots suggest that the crw model is a good fit to the data, although there is some skew in the QQ plot. We could re-run the models as rw or mp and see how this impacts model fit.

2.4 Fit a move persistence model

We may be interested in inferring how an individual’s behaviour changes along the movement path. Move persistence is a continuous behavioural index that captures autocorrelation in both speed and direction. For more information on move persistence please read:

  • Jonsen, ID, McMahon, CR, Patterson TA, Auger-Méthé M, Harcourt R, Hindell MA & Bestley S (2019). Movement responses to environment: Fast inference of variation among southern elephant seals with a mixed effects model. Ecology 100: e02566 (https://doi.org/10.1002/ecy.2566)

This is a different approach from the discrete behavioural states estimated by methods such as hidden Markov models. For information on those, see previous EFI/ESA webinars by Théo Michelot and Vianey Leos Barajas.

Move persistence models can be fitted either using fit_mpm, or simultaneously with the state-space model via the model = "mp" argument in fit_ssm.

If you fit the model with fit_mpm then you have the option to estimate move persistence independently for each animal (model = "mpm"), or jointly across individuals with a shared variance parameter (model = "jmpm").

Here we fit a joint move persistence model to the three animals and then plot the results using the default plot function.

fmpm <- fit_mpm(fit,
                what = "predicted",
                model = "jmpm",
                control = mpm_control(verbose = 0))
plot(fmpm,
     pages = 1,
     ncol = 2)

You can see that each path varies between periods of high and low move persistence, suggesting switches between migratory movement and residency. See the Manual Mapping section for an example of mapping the migration and visualising move persistence.

It is also possible to go on to link move persistence to environmental covariates. Take a look at Jonsen et al. (2019) for the method and Grecian et al. (2022) for an application to this data set.

2.5 Data extraction

You may want to extract the regularised data output from fit_ssm to use in a different application, or you may want to generate your own maps. You can do this with the grab function:

# extract predicted locations and their move persistence estimates
df_locs <- fit |> grab("predicted")
df_mpm <- fmpm |> grab("fitted")

# combine
df <- df_locs |> left_join(df_mpm)
## Joining with `by = join_by(id, date)`
df
## # A tibble: 633 × 17
##    id     date                  lon   lat      x     y    x.se    y.se         u
##    <chr>  <dttm>              <dbl> <dbl>  <dbl> <dbl>   <dbl>   <dbl>     <dbl>
##  1 hp6-7… 2019-03-23 00:00:00 -59.6  48.1 -6634. 6088. 1.00e-5 1.00e-5 -6.20e-11
##  2 hp6-7… 2019-03-24 00:00:00 -59.7  48.3 -6642. 6123. 2.28e+0 3.56e+0  8.99e- 2
##  3 hp6-7… 2019-03-25 00:00:00 -59.6  48.2 -6629. 6115. 3.13e+0 4.88e+0  1.21e+ 0
##  4 hp6-7… 2019-03-26 00:00:00 -59.5  48.2 -6621. 6108. 3.33e+0 5.17e+0  2.83e- 1
##  5 hp6-7… 2019-03-27 00:00:00 -59.5  48.2 -6620. 6110. 6.60e-1 8.10e-1 -1.02e- 1
##  6 hp6-7… 2019-03-28 00:00:00 -59.5  48.2 -6620. 6109. 3.37e+0 5.26e+0 -1.98e- 1
##  7 hp6-7… 2019-03-29 00:00:00 -59.4  48.3 -6614. 6133. 1.21e+1 6.52e+0  7.02e- 1
##  8 hp6-7… 2019-03-30 00:00:00 -59.3  48.6 -6597. 6177. 2.96e+0 3.89e+0  7.61e- 1
##  9 hp6-7… 2019-03-31 00:00:00 -59.2  48.6 -6591. 6177. 2.28e+0 3.39e+0  2.74e- 2
## 10 hp6-7… 2019-04-01 00:00:00 -59.1  48.7 -6581. 6192. 8.26e-1 1.03e+0  9.55e- 1
## # ℹ 623 more rows
## # ℹ 8 more variables: v <dbl>, u.se <dbl>, v.se <dbl>, s <dbl>, s.se <lgl>,
## #   logit_g <dbl>, logit_g.se <dbl>, g <dbl>

2.6 Manual mapping

To generate a map we need to load a suitable shapefile from the {rnaturalearth} package. These can be loaded into your session as an {sf} object ready to be plotted with geom_sf. I’ve then created an {sf} object manually from the df object containing the regularised locations and their corresponding move persistence estimates.

Given the geographic region, I’ve defined prj as a custom Lambert Azimuthal Equal Area projection centred on the study region. We specify the projection via the coord_sf function.

#install.packages("rnaturalearth")
require(rnaturalearth)
## Loading required package: rnaturalearth
# Generate a global shapefile and a simple plot
world <- ne_countries(scale = "medium", returnclass = "sf")

# To generate a plot with less distortion first define a projection i.e. Lambert Azimuthal Equal Area
prj <- "+proj=laea +lat_0=60 +lon_0=-50 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0"

# Create an sf version of the locs data with a WGS84 projection and add to the plot
df_sf <- df |> 
  st_as_sf(coords = c('lon', 'lat')) |>
  st_set_crs(4326)

ggplot() +
  theme_bw() +
  geom_sf(data = world) +
  geom_sf(aes(colour = g), data = df_sf, show.legend = "point") +
  scale_color_viridis_c(expression(gamma[t]), limits = c(0, 1)) +
  coord_sf(xlim = c(-2000000, 2000000), ylim = c(-2500000, 2500000), crs = prj, expand = TRUE) +
  scale_x_continuous(breaks = seq(from = -130, to = 30, by = 10)) +
  facet_wrap(~id, nrow = 1)

This nicely illustrates the most likely path of the three animals from the pupping ground in the Gulf of St Lawrence as they migrate north. Colouring the points by move persistence (\(\gamma_t\)) highlights periods when the animals were moving quickly and directionally (lighter colours) and periods when they were travelling more slowly and with less directional persistence (darker colours).


3 Extras

3.1 An example GPS data set

We can also use {aniMotum} to process animal movement data collected by GPS devices or geolocation sensors. To do this we need to specify a different location class via the lc column in the input data.

The following little penguin data set is available in the supplementary material to the aniMotum paper. Note the lc column contains the letter G to indicate to {aniMotum} that this is GPS data.

GPS devices can only fix their location when the penguin is at the surface, so unsurprisingly the GPS tracks contain many gaps where the animals were diving. For a more detailed analysis of the little penguin data please read the original paper alongside the analysis in the aniMotum paper. Here we regularise the location data to a 2-minute temporal resolution (fit_ssm expects time.step in hours, hence 2/60) and calculate move persistence.

lipe <- read_csv("https://raw.githubusercontent.com/ianjonsen/foieGras.paper/main/data/lipe_gps_ex32.csv")
lipe
## # A tibble: 10,698 × 5
##    id      date                lc      lon   lat
##    <chr>   <dttm>              <chr> <dbl> <dbl>
##  1 L013m01 2017-10-12 18:00:29 G      150. -36.3
##  2 L013m01 2017-10-12 18:00:39 G      150. -36.3
##  3 L013m01 2017-10-12 18:00:49 G      150. -36.3
##  4 L013m01 2017-10-12 18:00:59 G      150. -36.3
##  5 L013m01 2017-10-12 18:01:09 G      150. -36.3
##  6 L013m01 2017-10-12 18:01:27 G      150. -36.3
##  7 L013m01 2017-10-12 18:01:37 G      150. -36.3
##  8 L013m01 2017-10-12 18:01:55 G      150. -36.3
##  9 L013m01 2017-10-12 18:02:05 G      150. -36.3
## 10 L013m01 2017-10-12 18:02:23 G      150. -36.3
## # ℹ 10,688 more rows
lipe_ssm <- fit_ssm(lipe,
        vmax = 5,                            # speed filter threshold (m/s)
        min.dt = 5,                          # ignore locations < 5 s after the previous one
        model = "crw",
        time.step = 2/60,                    # 2-minute predictions (time.step is in hours)
        control = ssm_control(verbose = 0))
lipe_fmpm <- fit_mpm(lipe_ssm,
                what = "predicted",
                model = "jmpm",
                control = mpm_control(verbose = 0))
aniMotum::map(lipe_ssm, what = "predicted") | plot(lipe_fmpm, ncol = 2, pages = 1)
## using map scale: 10

3.2 A note on data checks

When fitting models in R it’s easy to forget the heavy lifting that these models are often doing behind the scenes.

Just a few years ago, a state-space model fit to animal movement data via MCMC would take hours or even days to run. Now, with TMB, these models take minutes.

This also means it’s easy to throw data into the model without doing thorough checks first.

Most issues can be avoided by checking for unique ids between animals, date-time formatting problems and missing values before passing any data to fit_ssm.

It’s also important to remember what location class your data have; if they are GPS or geolocation data the lc column needs to be set accordingly (see the notes on other data sources below).
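These pre-flight checks are easy to script in base R. The sketch below runs them on a small hypothetical data.frame (named locs_chk to avoid clobbering the harp seal locs), using the column names from this tutorial:

```r
# Hypothetical data.frame with the columns fit_ssm expects
locs_chk <- data.frame(
  id   = c("a1", "a1", "a2"),
  date = c("2019-03-23 00:08:33", "2019-03-23 06:12:00", "2019-03-23 01:00:00"),
  lc   = c("3", "B", "2"),
  lon  = c(-59.6, -59.5, -60.1),
  lat  = c(48.1, 48.2, 47.9)
)

# 1. dates parse cleanly to POSIXct (an NA flags a malformed date-time string)
parsed <- as.POSIXct(locs_chk$date, tz = "UTC")
stopifnot(!any(is.na(parsed)))

# 2. no missing coordinates or location classes
stopifnot(!any(is.na(locs_chk$lon)), !any(is.na(locs_chk$lat)), !any(is.na(locs_chk$lc)))

# 3. no duplicated id/date combinations
stopifnot(anyDuplicated(locs_chk[, c("id", "date")]) == 0)

# 4. dates increase within each animal
ok <- tapply(parsed, locs_chk$id, function(d) all(diff(d) > 0))
stopifnot(all(unlist(ok)))
```

If any of these assertions fails, fix the offending rows before passing the data to fit_ssm.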

3.3 Simulate from the model

We can simulate new movement paths from the fitted model using the sim_fit function. Here we simulate 5 tracks per animal, using the estimated movement parameters, and plot them alongside the SSM-predicted path.

trs <- sim_fit(fit,
               what = "predicted",
               reps = 5)      # number of simulated tracks per animal
plot(trs,
     ncol = 3)


3.4 Other data sources

{aniMotum} can be used to fit a state-space model to Argos PTT, GLS or GPS data. Have a look at the help file for fit_ssm.

For Argos PTT data you need to provide either the location class, or the semi-major and semi-minor axes of the error ellipse.

For GLS data, replace the lc column with the string "GL". For GPS data, replace the lc column with the string "G".

Ideally, for geolocation data you would have standard errors associated with each location, which can be passed as columns to XXX.

If you do not have these data then a good starting place is somewhere between 0.25 and 1 degree SE for latitude, and 0.5 times that value for longitude.

For GPS data you can try a small error variance, i.e. tens of metres.
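As a sketch of adding assumed location standard errors to a GLS track: the column names lonerr and laterr below are an assumption (check the fit_ssm help file for the names your installed version expects), and the track itself is hypothetical.

```r
# Hypothetical GLS track with lc = "GL"
gls <- data.frame(
  id   = "gls-example-01",
  date = as.POSIXct(c("2019-03-23 12:00:00", "2019-03-24 12:00:00"), tz = "UTC"),
  lc   = "GL",
  lon  = c(-59.6, -59.2),
  lat  = c(48.1, 48.4)
)

# Assumed standard errors (degrees); column names are an assumption,
# check the fit_ssm documentation for your version of aniMotum
gls$laterr <- 0.5               # latitude SE, from the 0.25-1 degree range above
gls$lonerr <- 0.5 * gls$laterr  # half the latitude value, as suggested above
```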
